merged-staircase property
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
In this work, we study the mean-field flow for learning subspace-sparse polynomials with stochastic gradient descent and two-layer neural networks, where the input distribution is standard Gaussian and the output depends only on the projection of the input onto a low-dimensional subspace. We propose a basis-free generalization of the merged-staircase property of Abbe et al. (2022) and establish a necessary condition for SGD-learnability. We further prove that this condition is nearly sufficient: a slightly stronger condition guarantees exponential decay of the loss functional to zero.
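To make the setup concrete, here is a minimal sketch (not code from the paper) of online SGD on a mean-field two-layer network with standard Gaussian inputs, where the target depends only on a 2-dimensional subspace of the ambient space. The staircase-type target z1 + z1*z2, the width, the learning rate, and the step count are all illustrative assumptions; the 1/m output normalization is compensated by O(1) per-neuron updates, matching the usual mean-field scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

d, m = 50, 512            # ambient dimension, network width (illustrative)
lr, steps = 0.05, 20000   # step size and number of online SGD steps (illustrative)

# Subspace-sparse target: depends only on the first two coordinates,
# i.e. on the projection of x onto a 2-dimensional subspace of R^d.
def target(x):
    return x[0] + x[0] * x[1]

# Two-layer network f(x) = (1/m) * sum_j a_j * tanh(w_j . x + b_j).
W = rng.normal(size=(m, d)) / np.sqrt(d)
a = rng.normal(size=m)
b = np.zeros(m)

for t in range(steps):
    x = rng.normal(size=d)               # fresh standard Gaussian sample
    h = np.tanh(W @ x + b)
    pred = a @ h / m
    g = pred - target(x)                 # gradient of 0.5 * (pred - y)^2 w.r.t. pred
    sech2 = 1.0 - h ** 2                 # tanh'(z) = 1 - tanh(z)^2
    ga = g * h                           # per-neuron gradients, with the 1/m
    gW = g * (a * sech2)[:, None] * x[None, :]  # absorbed into the learning rate
    gb = g * a * sech2
    a -= lr * ga
    W -= lr * gW
    b -= lr * gb

# Monte Carlo estimate of the test loss on fresh Gaussian inputs.
X = rng.normal(size=(2000, d))
preds = np.tanh(X @ W.T + b) @ a / m
ys = X[:, 0] + X[:, 0] * X[:, 1]        # same target as above
print("test MSE:", np.mean((preds - ys) ** 2))
```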
The merged-staircase property: a necessary and nearly sufficient condition for SGD learning of sparse functions on two-layer neural networks
Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz
It is currently known how to characterize functions that neural networks can learn with SGD for two extremal parameterizations: neural networks in the linear regime, and neural networks with no structural constraints. However, for the main parameterization of interest, non-linear yet regular networks, no tight characterization has yet been achieved, despite significant progress. We take a step in this direction by considering depth-2 neural networks trained by SGD in the mean-field regime. We consider functions on binary inputs that depend on a latent low-dimensional subspace (i.e., a small number of coordinates). This regime is of interest because it is poorly understood how neural networks routinely tackle high-dimensional datasets and adapt to latent low-dimensional structure without suffering from the curse of dimensionality. Accordingly, we study SGD-learnability with $O(d)$ sample complexity in a large ambient dimension $d$. Our main results characterize a hierarchical property, the "merged-staircase property", that is both necessary and nearly sufficient for learning in this setting. We further show that non-linear training is necessary: for this class of functions, linear methods on any feature map (e.g., the NTK) cannot learn efficiently. The key tools are a new "dimension-free" dynamics approximation result that applies to functions defined on a latent low-dimensional space, a proof of global convergence based on polynomial identity testing, and an improvement of lower bounds against linear methods for non-almost-orthogonal functions.
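For a function written as a sum of monomials with support sets S_1, ..., S_p, the merged-staircase property requires that the sets admit an ordering in which each set contributes at most one coordinate beyond the union of its predecessors. A minimal greedy check of this set-based formulation follows; the function name is illustrative, and the greedy strategy is safe because a set that is addable stays addable as the covered union only grows.

```python
def satisfies_msp(supports):
    """Greedy check of the merged-staircase property: the support sets
    can be ordered so that each adds at most one new coordinate beyond
    the union of the previously placed sets."""
    remaining = [frozenset(s) for s in supports]
    covered = set()
    while remaining:
        for s in remaining:
            if len(s - covered) <= 1:   # s introduces at most one new coordinate
                covered |= s
                remaining.remove(s)
                break
        else:
            return False                # no placeable set remains: MSP fails
    return True

# z1 + z1*z2 + z1*z2*z3 is a staircase and satisfies the MSP:
print(satisfies_msp([{1}, {1, 2}, {1, 2, 3}]))  # True
# z1*z2 alone does not: its only support set adds two new coordinates.
print(satisfies_msp([{1, 2}]))                   # False
```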